Fix reformer CI by ydshieh · Pull Request #21254 · huggingface/transformers

ydshieh · 2023-01-23T12:57:30Z

What does this PR do?

Some fixes are required for doctest after #21199. See comments in the review.

fix ReformerForMaskedLM doc example

ydshieh · 2023-01-23T13:01:37Z

src/transformers/models/reformer/modeling_reformer.py

        >>> tokenizer.add_special_tokens({"mask_token": "[MASK]"})  # doctest: +IGNORE_RESULT
        >>> inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")

+        >>> # resize model's embedding matrix


This tiny model checkpoint has an issue: with the fast tokenizer we get a vocab size of 1000, while with the slow tokenizer it is 320 (something like that).

This is because the old version of the tiny model creation script sometimes failed to convert a fast tokenizer to a slow one. It's not clear to me whether the slow tokenizer was uploaded separately after the creation, or whether something strange happened and the creation produced a slow tokenizer anyway, but with a smaller vocab size.

(It would be better to eventually use v2 of the tiny model creation script for the doctests, if necessary.)

ydshieh · 2023-01-23T13:03:49Z

src/transformers/models/reformer/modeling_reformer.py

        >>> inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")

+        >>> # resize model's embedding matrix
+        >>> model.resize_token_embeddings(new_num_tokens=model.config.vocab_size+1)  # doctest: +IGNORE_RESULT


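Resize to avoid an out-of-vocab error after PR #21199. That PR loads the fast tokenizer via AutoTokenizer, which has 1000 tokens, the same as the vocab size in the model config, so the added [MASK] token gets an id that falls outside the embedding matrix.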
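To make the off-by-one concrete, here is a toy sketch in plain Python. It does not use transformers: `vocab`, `embeddings`, `add_special_token` and `resize_embeddings` are hypothetical stand-ins for the tokenizer, the model's embedding matrix, `tokenizer.add_special_tokens` and `model.resize_token_embeddings`, and the zero-filled new row is an assumption made for simplicity.

```python
# Toy stand-ins (hypothetical, not the transformers API):
# a 1000-entry vocabulary and an embedding table with one row per token id.
vocab = {f"tok{i}": i for i in range(1000)}
embeddings = [[0.0] * 4 for _ in range(1000)]


def add_special_token(vocab, token):
    """Give a new token the next free id, as adding a special token does."""
    if token not in vocab:
        vocab[token] = len(vocab)
    return vocab[token]


def resize_embeddings(table, new_num_tokens):
    """Grow (or shrink) the table to new_num_tokens rows; new rows are zeros here."""
    dim = len(table[0])
    while len(table) < new_num_tokens:
        table.append([0.0] * dim)
    del table[new_num_tokens:]
    return table


mask_id = add_special_token(vocab, "[MASK]")  # next free id after 0..999

# Before resizing, looking up the new id fails: valid rows are 0..999 only.
try:
    embeddings[mask_id]
    lookup_failed = False
except IndexError:
    lookup_failed = True

# Mirror of `resize_token_embeddings(new_num_tokens=vocab_size + 1)` in the diff.
resize_embeddings(embeddings, len(embeddings) + 1)
mask_row = embeddings[mask_id]  # now a valid lookup
```

With the real library, a common way to pick the new size is `model.resize_token_embeddings(len(tokenizer))`, which keeps the matrix in step with however many tokens were added.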

ydshieh · 2023-01-23T13:04:00Z

src/transformers/models/reformer/modeling_reformer.py

        >>> predicted_token_id = logits[0, mask_token_index].argmax(axis=-1)
-        >>> tokenizer.decode(predicted_token_id)
-        'it'
+        >>> predicted_token = tokenizer.decode(predicted_token_id)


output is random
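Why assigning the result helps: doctest compares whatever an example prints against the expected text that follows it, and an assignment prints nothing, so there is nothing left to compare. The sketch below shows this with the standard library only; `BEFORE`, `AFTER` and `count_failures` are illustrative names, and `random.random()` stands in for the model's unstable output.

```python
import doctest

# An example whose printed value changes from run to run, checked against a
# hard-coded expectation. random.random() is always below 1, so this never matches.
BEFORE = """
>>> import random
>>> random.random()
7.09
"""

# The same call, but assigned to a variable: nothing is printed, so doctest
# has no output to compare and the example passes.
AFTER = """
>>> import random
>>> loss = random.random()
"""


def count_failures(text):
    """Run the doctest examples in `text` and return how many failed."""
    test = doctest.DocTestParser().get_doctest(text, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Swallow the failure report instead of printing it.
    return runner.run(test, out=lambda s: None).failed


failures_before = count_failures(BEFORE)
failures_after = count_failures(AFTER)
```

The trade-off is that the example no longer checks the value at all; it only verifies that the code runs without raising.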

ydshieh · 2023-01-23T13:04:08Z

src/transformers/models/reformer/modeling_reformer.py

        >>> outputs = model(**inputs, labels=labels)
-        >>> round(outputs.loss.item(), 2)
-        7.09
+        >>> loss = round(outputs.loss.item(), 2)


output is random
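Assigning to a variable is one way to stop doctest from comparing output; the `# doctest: +IGNORE_RESULT` directive seen earlier in this diff is another. `IGNORE_RESULT` is not a built-in doctest flag, so it has to be registered by the test setup. The following is a sketch of how such a flag can be built with the standard library; the class and helper names are assumptions for illustration, not a copy of the repository's code.

```python
import doctest

# Register the custom directive name so "# doctest: +IGNORE_RESULT" parses.
IGNORE_RESULT = doctest.register_optionflag("IGNORE_RESULT")


class IgnoreResultChecker(doctest.OutputChecker):
    """Output checker that accepts any output when IGNORE_RESULT is set."""

    def check_output(self, want, got, optionflags):
        if optionflags & IGNORE_RESULT:
            return True
        return super().check_output(want, got, optionflags)


SAMPLE = """
>>> import random
>>> random.random()  # doctest: +IGNORE_RESULT
7.09
"""


def count_failures(text, checker):
    """Run the examples in `text` with the given checker; return the failure count."""
    test = doctest.DocTestParser().get_doctest(text, {}, "sample", None, 0)
    runner = doctest.DocTestRunner(checker=checker, verbose=False)
    return runner.run(test, out=lambda s: None).failed


# The custom checker honours the flag; the stock checker ignores it.
with_flag = count_failures(SAMPLE, IgnoreResultChecker())
without_flag = count_failures(SAMPLE, doctest.OutputChecker())
```

The flag suits calls whose return value is incidental (such as resizing embeddings), while assignment reads more naturally when the value is used on a later line.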

ydshieh · 2023-01-23T13:04:48Z

src/transformers/models/reformer/modeling_reformer.py

        >>> predicted_class_id = logits.argmax().item()
-        >>> model.config.id2label[predicted_class_id]
-        'LABEL_0'
+        >>> label = model.config.id2label[predicted_class_id]


The output is random, as this checkpoint is not a sequence classification model.

ydshieh · 2023-01-23T13:04:59Z

src/transformers/models/reformer/modeling_reformer.py

        >>> labels = torch.tensor(1)
        >>> loss = model(**inputs, labels=labels).loss
-        >>> round(loss.item(), 2)
-        0.68


The output is random, as this checkpoint is not a sequence classification model.

HuggingFaceDocBuilderDev · 2023-01-23T13:21:10Z

The documentation is not available anymore as the PR was closed or merged.

sgugger

Thanks for the fixes!

ydshieh added 3 commits January 23, 2023 13:52

fix ReformerForSequenceClassification doc example

0d42112

fix ReformerForMaskedLM doc example

fix ReformerForSequenceClassification doc example

4603a6c

clean up

c98f4d7

clean up

f4b9097

ydshieh requested a review from sgugger January 23, 2023 13:09

sgugger approved these changes Jan 23, 2023

ydshieh merged commit cb6b568 into main Jan 23, 2023

ydshieh deleted the fix_reformer_ci branch January 23, 2023 14:34

ydshieh mentioned this pull request Jan 25, 2023

[Doctest] Fix Blenderbot doctest #21297

Merged

ydshieh merged 4 commits into main from fix_reformer_ci

3 participants
